Query Size Estimation using Systematic Sampling
Authors
Abstract
In this paper, we propose a new approach to estimating query size for select and join operations. The technique, which we call "systematic sampling", is a novel variant of the sampling approach: it sorts the relation before sampling and maintains a summary relation to improve run-time performance. We compare the method with a number of existing methods for query size estimation and demonstrate, with extensive experimental results, that it performs better than existing approaches over a wide range of data sets.
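The abstract gives only the outline of the technique (sort, then sample, then keep a summary relation), so the following is a minimal sketch of generic systematic sampling for a single-attribute range selection, not the paper's actual algorithm. The function names `build_summary` and `estimate_range_size`, the fixed sampling `step`, and the choice to start in the middle of each stride are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def build_summary(values, step):
    """Sort the attribute values and keep every `step`-th one.

    Hypothetical helper: the sorted, thinned list stands in for the
    "summary relation" mentioned in the abstract.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    ordered = sorted(values)
    # Assumption: start in the middle of each stride so that every
    # sampled value sits near the centre of the block it represents.
    return ordered[step // 2::step]


def estimate_range_size(summary, step, low, high):
    """Estimate how many tuples satisfy low <= value <= high.

    Each sampled value is taken to represent `step` tuples of the
    original relation, so the estimate is (sample hits) * step.
    """
    hits = bisect_right(summary, high) - bisect_left(summary, low)
    return hits * step


if __name__ == "__main__":
    relation = list(range(1000))           # toy relation: one integer attribute
    summary = build_summary(relation, 10)  # 100 sampled values
    print(estimate_range_size(summary, 10, 100, 299))  # prints 200
```

Because the summary is already sorted, each estimate costs two binary searches over the sample instead of a scan of the relation, which is one plausible reading of how a summary relation could improve run-time performance.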
Similar resources
Query Result Size Estimation Techniques in Database Systems
Query optimisers are critical to the efficiency of modern relational database systems. If a query optimiser chooses a poor query execution plan, the performance of the database system in answering the query can be very poor. In fact, the differences in cost between the least and most expensive query execution plans can be several orders of magnitude. On the other hand, it can be prohibitively e...
The Golden Estimator: Efficient Range Query
Query size estimation is crucial for many database system components. In particular, query optimizers need efficient and accurate query size estimation when deciding among alternative query plans. In this paper we propose the Golden Estimator, which is based on the so-called golden rule of sampling proposed by von Neumann, for estimating the size of single dimensional range queries. The Golden Es...
Query Estimation by Adaptive Sampling
The ability to provide accurate and efficient result estimations of user queries is very important for the query optimizer in database systems. In this paper, we show that the traditional estimation techniques with data reduction points of view do not produce satisfactory estimation results if the query patterns are dynamically changing. We further show that to reduce query estimation error, ins...
Selectivity Estimation of High Dimensional Window Queries via Clustering
Query optimization is an important functionality of modern database systems and often based on estimating the selectivity of queries before actually executing them. Well-known techniques for estimating the result set size of a query are sampling and histogram-based solutions. Sampling-based approaches heavily depend on the size of the drawn sample which causes a trade-off between the quality of...
Estimation Methods for the Size of Deep Web Textural Data Source: A Survey
The estimation of the size of deep web data sources has been an open problem since 1998. This survey reviews all papers that were available online, and other resources, on estimating the size of data sources during the period 1998 to 2008. In the survey, we first clarify several basic terms that are used in the survey but whose meanings vary in the literature. Basic models in the literature on...
Journal title:
Volume / Issue:
Pages: -
Publication date: 1996